Process Mining by Measuring Process Block Similarity

Authors

  • Joonsoo Bae
  • James Caverlee
  • Ling Liu
  • Hua Yan
Abstract

Mining, discovering, and integrating process-oriented services has attracted growing attention in recent years. Workflow precedence graphs and workflow block structures are two important factors for comparing and mining processes with distance-based similarity measures. Some existing work compares workflow designs based on their precedence graphs. However, standard distance metrics are lacking for comparing workflows that contain complex block structures such as parallel OR and parallel AND. In this paper we present a quantitative approach to modeling and capturing the similarity and dissimilarity between different workflow designs, focusing on the block structures of those designs. We derive the distance-based similarity measures by analyzing the workflow block structure of the participating workflow processes in four consecutive phases. First, we convert each workflow dependency graph into a block tree by using our block detection algorithm. Second, we transform the block tree into a binary tree to provide a normalized reference structure for distance-based similarity analysis. Third, we construct a binary branch vector by encoding the binary tree. Finally, we calculate the distance metric between two binary branch vectors. Our initial experience shows that this distance measure can be used as a quantitative and qualitative tool for understanding and detecting block structure similarity and dissimilarity between two workflow designs. It can be effectively combined with a workflow precedence-based similarity analysis tool in process mining, process merging, and process clustering, and ultimately it can reduce the costs involved in the design, analysis, and evolution of workflow systems.
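The last three phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the block tree is already available (the block detection algorithm is not reproduced here), that the binary tree is obtained with the common left-child/right-sibling transformation, that each binary branch is the triple (node label, left child label, right child label), and that the distance is the L1 norm of the difference between the two branch-count vectors. The names `to_binary`, `branch_vector`, `branch_distance`, and the block labels `SEQ`, `AND`, `OR` are hypothetical.

```python
from collections import Counter

EMPTY = "ε"  # assumed placeholder label for a missing child


def to_binary(tree):
    """Convert an ordered block tree (label, [children]) into
    left-child/right-sibling form: (label, left, right) or None."""
    def convert(siblings):
        if not siblings:
            return None
        (label, children), rest = siblings[0], siblings[1:]
        # left = first child, right = next sibling
        return (label, convert(children), convert(rest))
    return convert([tree])


def branch_vector(node, counts=None):
    """Count every binary branch (label, left label, right label)."""
    if counts is None:
        counts = Counter()
    if node is None:
        return counts
    label, left, right = node
    key = (label, left[0] if left else EMPTY, right[0] if right else EMPTY)
    counts[key] += 1
    branch_vector(left, counts)
    branch_vector(right, counts)
    return counts


def branch_distance(tree_a, tree_b):
    """L1 distance between the binary branch vectors of two block trees."""
    va = branch_vector(to_binary(tree_a))
    vb = branch_vector(to_binary(tree_b))
    return sum(abs(va[k] - vb[k]) for k in va.keys() | vb.keys())


# Two hand-written block trees that differ only in one parallel block.
wf_a = ("SEQ", [("A", []), ("AND", [("B", []), ("C", [])]), ("D", [])])
wf_b = ("SEQ", [("A", []), ("OR", [("B", []), ("C", [])]), ("D", [])])

print(branch_distance(wf_a, wf_a))  # identical designs
print(branch_distance(wf_a, wf_b))  # AND block replaced by OR block
```

Under these assumptions, identical designs have distance 0, and swapping one parallel AND block for a parallel OR block changes two binary branches on each side, so the distance grows with the number of structural differences.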


Similar resources

Document Clustering Based on Ontology and a Fuzzy Approach

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, the unsupervised classification of similar documents into different groups, is one of the important techniques of text mining. The most important step...


A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...


A Hybrid Fuzzy MCDM Approach to Determine an Optimal Block Size in Open-Pit Mine Modeling: a Case Study

The computer-based 3D modeling of ore bodies is one of the most important steps in the resource estimation, grade determination, and production scheduling of open-pit mines. In the modeling phase, the volume of the orebody model is required to be filled by the blocks and sub-blocks. The determination of Block Size (BS) is important due to the dependence of the geostatistical issues and calculat...


Estimation of metallurgical parameters of flotation process from froth visual features

The estimation of metallurgical parameters of flotation process from froth visual features is the ultimate goal of a machine vision based control system. In this study, a batch flotation system was operated under different process conditions and metallurgical parameters and froth image data were determined simultaneously. Algorithms have been developed for measuring textural and physical froth ...


A graph distance based metric for data oriented workflow retrieval with variable time constraints

Many applications in business process management, such as workflow retrieval and process mining, require measuring the similarity between business processes. However, most existing approaches and models cannot represent variable constraints or achieve data-oriented workflow retrieval that considers different QoS requirements, and also fail to allow users to express arbitrary...



Publication date: 2006